Overview of single‐cell RNA sequencing analysis and its application to spermatogenesis research

Abstract
Background: Single‐cell transcriptomics allows parallel analysis of multiple cell types in tissues. Because testes comprise somatic cells and germ cells at various stages of spermatogenesis, single‐cell RNA sequencing is a powerful tool for investigating the complex process of spermatogenesis. However, single‐cell RNA sequencing analysis needs extensive knowledge of experimental technologies and bioinformatics, making it difficult for many, particularly experimental biologists and clinicians, to use it.
Methods: Aiming to make single‐cell RNA sequencing analysis familiar, this review article presents an overview of experimental and computational methods for single‐cell RNA sequencing analysis with a history of transcriptomics. In addition, combining the PubMed search and manual curation, this review also provides a summary of recent novel insights into human and mouse spermatogenesis obtained using single‐cell RNA sequencing analyses.
Main Findings: Single‐cell RNA sequencing identified mesenchymal cells and type II innate lymphoid cells as novel testicular cell types in the adult mouse testes, as well as detailed subtypes of germ cells. This review outlines recent discoveries into germ cell development and subtypes, somatic cell development, and cell–cell interactions.
Conclusion: The findings on spermatogenesis obtained using single‐cell RNA sequencing may contribute to a deeper understanding of spermatogenesis and provide new directions for male fertility therapy.


| INTRODUCTION
In the more than 20 years since the Human Genome Project, transcriptome analysis has developed dramatically, supported by the development of next-generation sequencers and the growth of computer technologies. Recently, the resolution of transcriptome analysis has reached the single-cell level. Single-cell transcriptome analysis, such as single-cell RNA sequencing (scRNA-seq), can provide novel insights into complex biological systems in which many types of cells are involved, such as spermatogenesis. On the other hand, scRNA-seq analysis requires highly specialized skills in experimental molecular biology and bioinformatics, which makes many researchers hesitant to adopt it. In this first section, I give an overview of the development of transcriptome analysis and describe the widely accepted basic knowledge of spermatogenesis.

| Gonadal development
The testes are male reproductive glands that support lifelong spermatogenesis. They first emerge as genital primordia, the common precursor of the testes and ovaries, from the mesonephros on approximately embryonic day (E) 10 in mice and at 4 weeks of embryonic age in humans. 1 After E10.5 in mice, primordial germ cells (PGCs), which differentiate from proximal epiblast cells upon bone morphogenetic protein (BMP) stimulation at around E6.5, migrate to the gonadal primordia. 2 In the male gonadal primordium, SRY, a transcription factor encoded by a gene on the Y chromosome, emerges at around E11.5-12.0 in mice and 6 weeks of embryonic age in humans and activates the downstream genes necessary for testis development, [3][4][5][6][7][8] which commit the genital primordium to a testicular fate. After sex determination, the specialized structure of the testes forms (Figure 1). The main component of the testis is the seminiferous tubules. The inside of the seminiferous tubules is lined with Sertoli cells as the epithelial layer; spermatogenesis takes place at the boundaries between adjacent Sertoli cells, and spermatozoa are released from the epithelial layer into the lumen. Other testicular somatic cells, including Leydig cells, peritubular myoid cells (PTMs), and immune cells, are located in the interstitial space of the testis. Paracrine factors secreted from these testicular somatic cells, which communicate closely with each other, act together with endocrine factors to support proper spermatogenesis. In addition, migrated PGCs differentiate into prospermatogonia (ProSPG) (also called prespermatogonia or testicular gonocytes) in male fetal testes, along with sex determination. 2 ProSPG are initially mitotically active (M-ProSPG) and then progress to primary transitional (T1)-ProSPG, which are located at the center of the seminiferous tubule and are mitotically quiescent.
Soon after birth, T1-ProSPG migrate to the basal lamina side of the seminiferous tubule and convert to secondary transitional (T2)-ProSPG, which initiate proliferation. 9 Then, some ProSPG differentiate directly into differentiating spermatogonia (SPG) and initiate spermatogenesis, skipping the undifferentiated SPG stage; this is known as the first wave of spermatogenesis. 10 Other ProSPG, on the other hand, become spermatogonial stem cells (SSCs) for lifelong steady-state spermatogenesis.

| Spermatogenesis
Spermatogenesis, a multistep process that generates mature sperm from spermatogonial stem cells (SSCs), takes place in the seminiferous tubules generally after puberty (steady-state spermatogenesis; Figure 1). In contrast to postpubertal steady-state spermatogenesis, the first wave of spermatogenesis, which occurs soon after birth in rodents but not in humans, appears to originate from non-self-renewing SPG or directly from ProSPG but not from the self-renewing SSCs. 9 SSCs are a subset of spermatogonia (SPG); they maintain their pool by mitotic self-renewal and can also give rise to source cells for spermatogenesis via asymmetric cell division.
In mice, SSCs are considered identical to, or included in, A-single spermatogonia, a subset of undifferentiated SPG. An A-single spermatogonium gives rise to two A-paired spermatogonia through an incomplete cell division that does not complete cytokinesis, leaving the cells interconnected via intercellular bridges to form a syncytium. Further incomplete cell divisions produce chains of 4-16 undifferentiated SPG (A-aligned spermatogonia). Undifferentiated SPG then divide further to yield approximately 512 cells as differentiating SPG (Type A1, A2, A3, A4, In, and B SPG). 11,12 In contrast, human undifferentiated SPG increase to four cells with two incomplete cell divisions. Only one incomplete cell division occurs during the differentiating SPG stage, generating eight cells, which are interconnected by intercellular bridges. 11 After mitotic divisions at the SPG stage, differentiating SPG convert into spermatocytes, initiating meiosis. 13,14 Meiotic prophase I is subdivided into leptotene, zygotene, pachytene, diplotene, and diakinesis stages. Then, the cells undergo two meiotic divisions, resulting in round haploid spermatids, which undergo a morphological change, known as spermiogenesis, to form elongated spermatids and spermatozoa. 13,14 Finally, spermatozoa are released into the lumen of the seminiferous tubules. The germ cells derived from one SSC remain connected by cytoplasmic bridges until the elongated spermatid stage. 15-17
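The clone sizes quoted above follow from simple doubling arithmetic, which can be written out as a short worked example. This is only an idealized sketch: it assumes that every division is synchronous and that no cell is lost, whereas the text gives the mouse figure as approximate. The function name `cells_after` is a hypothetical helper introduced for illustration.

```python
def cells_after(divisions: int, start: int = 1) -> int:
    """Number of cells in a clone after synchronous doublings (idealized)."""
    return start * 2 ** divisions


# Mouse: A-single (1) -> A-paired (2) -> A-aligned chains (4, 8, 16)
mouse_undifferentiated = [cells_after(n) for n in range(5)]

# About 512 cells corresponds to nine doublings from a single A-single cell
mouse_differentiating = cells_after(9)

# Human: two incomplete divisions give 4 undifferentiated SPG,
# and one further division gives 8 differentiating SPG
human_undifferentiated = cells_after(2)
human_differentiating = cells_after(1, start=human_undifferentiated)

print(mouse_undifferentiated, mouse_differentiating)
print(human_undifferentiated, human_differentiating)
```

Under these assumptions, the mouse lineage amplifies a single stem cell division roughly 64 times more than the human lineage before meiosis begins (512 versus 8 cells).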

| Testicular somatic cells
Sertoli cells, the "nurse cells", are the only somatic cell type within the seminiferous tubules and directly contact germ cells. Sertoli cells are responsible for many functions critical for spermatogenesis. At the neonatal stage, Sertoli cells regulate the migration of gonocytes to the basement membrane of seminiferous tubules and stimulate their proliferation to establish the SSC pool. [18][19][20] Sertoli cells also construct a crucial structure, the blood-testis barrier (BTB), which consists of specialized junctions between adjacent Sertoli cells composed of adherens, gap, and tight junctions. The BTB physically separates the abluminal compartment and lumen of seminiferous tubules from the basal compartment and interstitial space, maintaining the immune-privileged environment. Spermatogonial cell division and differentiation occur in the basal compartment, whereas meiosis and postmeiotic development occur in the abluminal compartment. Thus, spermatocytes must move through the BTB to enter meiosis, a process regulated by multiple signaling pathways, such as the TGFβ/Smad and MAPK signaling pathways. 21 Leydig cells in the interstitial space are mainly responsible for the synthesis of steroids, such as testosterone, upon luteinizing hormone (LH) stimulation.
Testosterone regulates PTMs, Leydig, and Sertoli cells through the androgen receptor. 22 Leydig cells are also sources of other humoral factors essential for the proliferation of SSCs, such as glial cell line-derived neurotrophic factor (GDNF), CSF1, and IGF1. [23][24][25][26] Seminiferous tubules are surrounded by PTMs, which mainly function in seminiferous tubule contraction to transport spermatozoa. PTMs also play an essential role in maintaining SSCs by secreting GDNF upon testosterone-androgen receptor binding. [27][28][29] In addition to the major testicular somatic cells, immune cells, such as macrophages and natural killer cells, are known to reside in the interstitial space. Two subtypes of macrophages have been identified in the interstitial space: one lines the surface of the seminiferous tubules, and the other is associated with the vasculature. [30][31][32] The macrophages located around the vasculature are associated with vascular development and promote steroidogenesis in Leydig cells. 33

| History of transcriptomics
In the 1990s, genome and cDNA projects were launched for many species, including humans and mice. [37][38][39][40][41] These projects have allowed researchers to analyze biological phenomena and diseases on a genome-wide level. These genome-wide comprehensive analyses, including genomic, transcriptomic, and epigenomic analyses, are known as omics analyses. Among omics analyses, transcriptomics, in which whole transcripts of a sample are obtained and analyzed, is the most powerful tool to delineate the overall picture of biological phenomena at the molecular level. The initial stage of transcriptomics utilized expressed sequence tags (ESTs). 42 ESTs are sequences of several hundred bases from the 5′- or 3′-ends of the cDNA clones in a library, and their counts roughly reflect transcript abundance in the library.
Because one sequencing reaction in a standard Sanger sequencer typically yields one EST, EST analysis needs many sequencing reactions to achieve sufficient quantitative accuracy.

| Single-cell RNA sequencing library construction
The first step in scRNA-seq is cell dissociation to release individual cells. The standard cell dissociation method, a traditional method used for many applications, such as flow cytometry, uses tissue dissociation enzymes, including trypsin and collagenase, at 37°C. Because the distinct characteristics of different cell types and tissue types, such as stiffness and size, affect cell dissociation, this step may damage and deplete specific cell types, significantly affecting the outcome of scRNA-seq data. 47 Therefore, the cell dissociation protocol should be carefully optimized for the specific tissue type.
In addition, a recent report suggested that this dissociation method induces a stress response at the gene expression level, affecting the scRNA-seq results. 48,49 On the other hand, cell dissociation on ice using a cold-active protease, which can avoid the stress response occurring during cell dissociation at 37°C, provides a better approach for scRNA-seq. 49
After cell dissociation, several approaches can be applied to capture single cells (Figure 2). For analyses targeting a relatively low number of cells (up to several hundred), dissociated individual cells are captured by manual pipetting from the cell suspension or by cell sorting and are isolated into individual wells of a 96- or 384-well microtiter plate. Microfluidic systems, such as the Fluidigm® C1 platform, are also used to capture individual cells. 50,51 Laser microdissection, which directly captures single cells from a tissue section without cell dissociation, is another option for single-cell capture. 52 In contrast, microwell- and droplet-based approaches are used to analyze more than 1000 cells. 51 Because microwell- and droplet-based approaches require specialized instruments, commercial systems are available, including the ICELL8® cx Single-Cell system from Takara Bio, the Nadia Innovate system from Dolomite Bio, and the Chromium system from 10x Genomics. In particular, the Chromium system is currently a widely used platform. 53 While microwell- and droplet-based approaches require specialized instruments, the split-pool combinatorial barcoding method (known as SPLiT-seq) can analyze large numbers of single cells without customized equipment, lowering the hurdle to high-throughput scRNA-seq. 54
The captured single cells are then subjected to RNA-seq library construction. The sequencing libraries are either full-length, 3′-end, or 5′-end cDNAs. Full-length cDNA libraries, such as those produced by the SMART-seq® methods, 55-59 detect RNA processing events, such as alternative and aberrant RNA splicing. 60 While SMART-seq® methods use an oligo-dT primer for reverse transcription, random displacement amplification sequencing (RamDA-seq®) utilizes random primers, detecting non-poly-A transcripts, such as noncoding RNA. 61 Because of the small amount of RNA in a single cell, cDNA amplification is typically required for scRNA-seq library preparation. PCR is typically utilized for cDNA amplification, but PCR amplifies cDNA unevenly because of differences in length and GC content and because of stochasticity. Therefore, many scRNA-seq protocols, including MARS-seq, 62 Drop-seq, 63 inDrop, 64 CEL-seq2, 65 SCRB-seq, 66 Quartz-seq2, 67 SMART-seq3, 56 and FLASH-seq, 59 adopt a unique molecular identifier (UMI) to ensure quantifiability. 68,69 A UMI is a random sequence tag incorporated into the sequencing library before PCR amplification.
In theory, every original molecule carries its own UMI, so amplification bias can be compensated for by counting unique UMI sequences rather than reads. Many scRNA-seq library construction methods have thus been established. Representative methods have been benchmarked in several reports, providing clues for selecting an optimal library construction method for each experiment. [70][71][72]
FIGURE 2 Workflow of single-cell RNA-seq analysis. Tissue is dissociated and subjected to single-cell capturing. Single cells are captured by either manual pipetting, cell sorting, droplet, microfluidics, microwell, or laser microdissection. Representative systems for each cell capturing system are shown. A scRNA-seq library is constructed and sequenced using next-generation sequencing. Bioinformatic analyses, such as clustering, dimensionality reduction, pseudotime/trajectory, RNA velocity, marker expression, and cell-cell interaction analyses, are performed using the scRNA-seq data. Several examples of R or Python packages for each analysis are shown. The numbers in brackets are reference numbers of each package.
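The UMI counting principle described in this section can be illustrated with a minimal Python sketch. It assumes that each read has already been assigned a cell barcode, a gene, and a UMI; the function name `count_umis`, the barcodes, and the reads are hypothetical and serve only as illustration. Dedicated tools additionally correct sequencing errors within UMIs, which this sketch does not attempt.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene), collapsing PCR duplicates."""
    umis = defaultdict(set)
    for cell_barcode, gene, umi in reads:
        umis[(cell_barcode, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Hypothetical reads: (cell barcode, gene, UMI)
reads = [
    ("CELL_A", "Id4", "AACGT"),
    ("CELL_A", "Id4", "AACGT"),  # PCR duplicate of the first read
    ("CELL_A", "Id4", "AACGT"),  # another PCR duplicate
    ("CELL_A", "Id4", "GGTCA"),  # a second original molecule
    ("CELL_B", "Id4", "AACGT"),  # same UMI but a different cell
    ("CELL_B", "Sox9", "TTAGC"),
]

counts = count_umis(reads)
for (cell, gene), n in sorted(counts.items()):
    print(cell, gene, n)
```

Here four reads of Id4 in the first cell collapse to two molecules, which shows how uneven PCR amplification is removed from the expression estimate.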

| Computational analysis of scRNA-seq data
As with bulk RNA-seq, the scRNA-seq library is sequenced using a short-read sequencer such as the HiSeq (Illumina) (Figure 2). Then, in combination with bioinformatic techniques, the comprehensive parallel transcriptome data generated using scRNA-seq reveal many biological aspects (Figure 2). Currently, several useful integrated scRNA-seq analysis packages, such as Seurat and Scanpy, are provided as R or Python packages, which provide a typical workflow of scRNA-seq analysis. 73 of the target sample. Therefore, although the cost of scRNA-seq is currently very high, taking biological replicates is recommended. 92,93 In addition to conventional analyses, analyses characteristic of scRNA-seq data are also employed to discover novel insights. In particular, pseudotime/trajectory analysis is one of the unique analyses of scRNA-seq. 83 Thus, although the quantifiability of scRNA-seq is lower than that of bulk RNA-sequencing because of the lower sequence depth per cell, scRNA-seq is a powerful tool for analyzing complex tissues.
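Because sequence depth differs from cell to cell, integrated workflows commonly start by normalizing each cell to a common total and applying a log transform before clustering or dimensionality reduction. The following pure-Python sketch illustrates that single step under stated assumptions: the function name `log_normalize`, the scale factor of 10,000, and the count values are all chosen for illustration and are not taken from this review. Analysis packages provide their own optimized implementations.

```python
import math


def log_normalize(counts, scale=10_000):
    """Scale one cell's counts to a common total, then apply log(1 + x)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale) for gene, c in counts.items()}


# Two hypothetical cells with the same composition but a tenfold depth difference
shallow_cell = {"Id4": 2, "Kit": 1, "Sox9": 1}
deep_cell = {"Id4": 20, "Kit": 10, "Sox9": 10}

shallow_norm = log_normalize(shallow_cell)
deep_norm = log_normalize(deep_cell)

for gene in shallow_cell:
    print(gene, round(shallow_norm[gene], 3), round(deep_norm[gene], 3))
```

After normalization the two cells have the same values for every gene, so the depth difference no longer masquerades as a biological difference.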
Recently, because the testes consist of many cell types and germ cells at different differentiation stages, scRNA-seq has been used to investigate spermatogenesis. The following sections provide a summary of novel insights into spermatogenesis recently reported using single-cell transcriptomics.

| SINGLE-CELL TRANSCRIPTOMICS IN SPERMATOGENESIS
The number of reports on single-cell transcriptomics in spermatogenesis has progressively increased since 2018. As of December 18, 2022, a PubMed search with the words "spermatogenesis", "single cell", and "RNA sequencing" identified 58 original articles, excluding review, comment, or data descriptor articles. Of the 58 articles, 46 were about humans or mice. Thus, most scRNA-seq studies of spermatogenesis to date have been performed in humans or mice, and more reports are required before other species, such as goats and rats, can be discussed. Therefore, based on the PubMed search and manual curation, I summarize recent novel findings about human and mouse spermatogenesis in this section.

| Discovery of novel testicular cell types
scRNA-seq can identify cell types that have not been previously identified. Using scRNA-seq of whole testes of adult C57BL/6J mice, transcription factor 21 (TCF21)+ mesenchymal cells were identified as a novel testicular somatic cell type. 104 Rank correlation analysis revealed that the TCF21+ mesenchymal cell population is reminiscent of an embryonic interstitial cell progenitor. 104   However, a few mitotically active I-ProSPGs (ELMO1+ PALLD+) were identified at the periphery of the seminiferous tubules.

| Subtypes in the spermatogonial stem cell population
Because SSCs can potentially restore the fertility of childhood cancer survivors by transplantation, it is important to identify specific markers to isolate SSCs. 123 SSCs are recognized to express prototypical SSC markers, such as ID4, GFRA1, and NANOS3. However, a growing body of evidence indicates the heterogeneity of SSCs. [124][125][126][127][128][129][130][131][132][133][134] In mice, single-cell qRT-PCR analysis suggested that even the ID4+ SPG population is heterogeneous and that the ID4-bright SPG population is an SSC-enriched population. 135    They also identified novel subtypes of SSC and differentiated SPG in humans. 111 Transition cells (TCs), a distinct cell subset in human adult testes, appear to represent the transition status between undifferentiated and differentiating SPG. 111 Consistent with TCs being in a state between infrequent and active cell proliferation, TCs specifically express CCND2 and SPRY1, cell proliferation-promoting proteins. 140,141 Together, analysis of scRNA-seq data revealed the trajectory of spermatogonial development (Figure 4) and consistently reported a primitive SSC population, which can potentially be applied to treat male infertility. Importantly, because a definitive primitive SSC marker is yet to be established in humans, surface proteins, such as LPPR3 and TSPAN33, are helpful in enriching the most primitive SSCs.

| The behavior of sex chromosome-linked genes
During male meiosis, asynapsis of sex chromosomes induces meiotic sex chromosome inactivation (MSCI), forming an XY body. [142][143][144][145][146][147][148] These inactivated sex chromosome-linked genes are partially reactivated during spermiogenesis, but another sex chromosome inactivation occurs because of postmeiotic sex chromatin (PMSC) formation during the late spermatid stage. 149 The series of sex chromosome-linked gene regulations was quantitatively and temporally confirmed using scRNA-seq. 104,[112][113][114][118][119][120]147 The sex chromosome-linked genes are highly expressed in SPG, and their expression dramatically decreases during meiosis, followed by reactivation after meiosis. Although the precise timing of MSCI is controversial, scRNA-seq analyses suggested that MSCI occurs during meiosis prophase I in both mice and humans. 114,119,147 Interestingly, although the timing and silencing degree of MSCI is similar between X-and Y chromosome-linked genes, Y chromosome-linked gene silencing appears to occur before X chromosome-linked gene silencing, and the silencing degree is more intense than X chromosome-linked genes in PMSC, suggesting that transcripts from Y-linked genes are more unstable. 120,147 Haploid spermatids have either X or Y chromosomes as a result of meiosis. Several studies have suggested that X and Y chromosome-

| Testicular somatic cell development
Testicular somatic cell development and detailed characteristics have also been determined using scRNA-seq in mouse testes.
It has been suggested that fetal Leydig cells differentiate from the interstitial progenitors. 154 Fetal Leydig cells are believed to be gradually replaced by functionally distinct adult Leydig cells after birth 162 but several reports have claimed that a small population of fetal Leydig cells persists in the postnatal stage in mice and rats. [163][164][165][166][167] scRNA-seq analysis by Ernst et al. 147 showed that most fetal Leydig cells were replaced by adult Leydig cells by P15 in mice. Tan et al. 109 also suggested that murine postnatal Leydig cells differ from prenatal Leydig cells. Thus, scRNA-seq did not detect persistent fetal Leydig cells after birth. The fate of fetal Leydig cells in postnatal testes requires more careful analysis, which may be available by reusing published scRNA-seq datasets.
Combined with the finding that TCF21 + mesenchymal cells seem to be the common precursors of Leydig and PTMs and that they may be derived from interstitial precursors, somatic cell development is summarized in Figure 5.

| Seminiferous tubule cycle
Spermatogenesis is orchestrated by periodically cycling through the 12 stages of the seminiferous epithelial cycle (I to XII). The seminiferous epithelial cycle is canonically defined by the distinct set of germ cell types observed in the seminiferous tubule cross-section. 168 [173][174][175][176] Because the adult Sertoli cell is large (larger than 300 μm³ in mice), droplet-based scRNA-seq of adult testes frequently depletes Sertoli cells. Therefore, enrichment of Sertoli cells before scRNA-seq using transgenic mice, such as SOX9-GFP transgenic mice, is applied to obtain a sufficient number of adult Sertoli cells for scRNA-seq analysis. 104,177 These scRNA-seq analyses segregated adult Sertoli cells into several clusters, which correspond to seminiferous epithelial stages, indicating differential transcriptomes between stages. Interestingly, stages VI-VIII were consistently and clearly segregated as a cluster, suggesting that Sertoli cells at the stages of the retinoic acid pulse show distinctive transcriptome characteristics.

| Cell-cell interaction
Cells in a tissue intercommunicate through direct contact, gap junctions, and paracrine signaling using hormonal factors. Taking advantage of the parallel transcriptome data of all cell types in a tissue, scRNA-seq data can be used to predict cell-cell interactions based on the expression patterns of receptors and their corresponding ligands. [178][179][180] For example, although previous reports illustrated that Notch signaling in mouse Sertoli cells promotes the differentiation of quiescent ProSPG, 181,182 scRNA-seq revealed that receptors of Notch signaling, such as Notch2 and Notch3, are expressed in nearly all testicular somatic cell types, including Sertoli, Leydig, PTMs, and endothelial cells. 108,109,111,112,116,117 Ligands of Notch signaling, such as DLL1, DLL2, DLL4, and DLK1, are also expressed in almost all testicular cell types, including somatic and germ cells. 108,109,111,112,116 Thus, Notch signaling may play a broader role in spermatogenesis than is currently known.
Activin signaling is essential in Sertoli cell differentiation and function. [183][184][185] Ligands of activin signaling, activin and inhibin, have been reported to be expressed in Sertoli cells, Leydig cells, PTMs, and germ cells. 186,187 scRNA-seq analyses confirmed that activin/inhibin is expressed in stromal, PTMs, Sertoli, and Leydig cells. 108,109,112 On the other hand, activin receptor genes (Acvr1b, Bmpr1b, and Acvr2b) are expressed in SPG, whereas their inhibitor genes, Fst, Bambi, and Nog, are selectively expressed in undifferentiated SPG. Thus, activin signaling appears to play an essential role in spermatogonial differentiation. 108 In addition to the Notch and activin signaling pathways, other pathways for ligand-receptor pairs, including the retinoic acid, Kit, Gdnf, Wnt, Fgf, and Hedgehog pathways, have been suggested by scRNA-seq to be involved in spermatogenesis. Because these cell-cell interactions are predicted from expression data, these novel findings should be further validated.
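The ligand-receptor reasoning used in this section can be sketched as follows. This is a toy scoring scheme, not the algorithm of any particular package: it assumes that the mean expression of each gene per cell type is already available and scores a sender-receiver pair as the product of the ligand mean in the sender and the receptor mean in the receiver. The function name `interaction_scores` and all expression values are hypothetical.

```python
def interaction_scores(mean_expr, pairs):
    """Score sender -> receiver interactions as ligand mean x receptor mean."""
    scores = {}
    for ligand, receptor in pairs:
        for sender, sender_expr in mean_expr.items():
            for receiver, receiver_expr in mean_expr.items():
                score = sender_expr.get(ligand, 0.0) * receiver_expr.get(receptor, 0.0)
                if score > 0:
                    scores[(sender, ligand, receiver, receptor)] = score
    return scores


# Hypothetical mean expression per cell type (arbitrary units)
mean_expr = {
    "germ": {"Dll1": 2.0, "Notch2": 0.0},
    "Sertoli": {"Dll1": 0.5, "Notch2": 3.0},
    "Leydig": {"Dll1": 0.0, "Notch2": 1.0},
}
pairs = [("Dll1", "Notch2")]

scores = interaction_scores(mean_expr, pairs)
best = max(scores, key=scores.get)
print(best, scores[best])
```

A high score only indicates that a ligand and its receptor are co-expressed in the two populations; as noted above, such predictions still require experimental validation.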

| Pathological analysis using scRNA-seq
scRNA-seq is also helpful for pathological analysis, especially in human biopsy samples. Nonobstructive azoospermia (NOA) is a major pathological cause of male infertility. Wang et al. compared scRNA-seq data from the testis of a patient with NOA with data from healthy testes and identified a panel of differentially expressed genes in Sertoli cells. 114 Since the identified differentially expressed genes were largely involved in spermatogenesis, these genes are potential drug targets, although only one NOA patient was tested. Zhao et al. 188 also observed a substantial difference between normal and NOA Sertoli cells. Furthermore, they reported differences among NOA types (idiopathic NOA, Klinefelter syndrome, and Y chromosome AZF region microdeletion). Importantly, pathway analysis comparing idiopathic NOA and normal adult Sertoli cells suggested Wnt signaling as a possible drug target for idiopathic NOA therapy.

| FUTURE PERSPECTIVES
Due to the development and application of NGS, substantial transcriptome data can be obtained at a relatively low cost. Accordingly, many transcriptome datasets are available in public data repositories, and the number of datasets is increasing daily. Because transcriptome data, in particular scRNA-seq, are comprehensive, the dataset can be reused to analyze facets of biological questions other than those included in the original analysis. Furthermore, combinatorial analysis of multiple datasets obtained from independent studies may also provide new insights. In addition, because one of the primary roles of transcriptome analyses is to provide fundamental material for hypothesis building, showing the overall description of the biological phenomena, further experimental validations based on the transcriptome data-derived hypothesis are essential.
scRNA-seq methods are still being developed. For instance, current scRNA-seq techniques are combined with surface marker expression profiling, known as cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) or RNA expression and protein sequencing (REAP-seq). 189,190 In these methods, the cells are first labeled using oligonucleotide-conjugated antibodies for the target surface markers. The oligonucleotides include antibody-specific barcodes with poly-A tails to allow their capture using polydT primers in the reverse transcription reaction of scRNA-seq library construction. Thus, the surface marker expression profile can be measured by counting antibody-derived barcode sequences. Furthermore, single-cell multiomics analyses have been developed, such as the simultaneous analysis of transcriptome and chromatin accessibility. 191 These cutting-edge analyses have provided valuable resources and important insights into spermatogenesis.
A drawback of scRNA-seq is that it does not provide spatial information. Recently, spatial transcriptome analysis, which analyzes the transcriptome of tiny regions of a tissue section following immunohistochemistry or fluorescence in situ hybridization, has made it possible to simultaneously analyze subcellular to single-cell level transcriptome and spatial information. 192 This in situ capture-based approach uses a glass slide that is densely printed with small probe spots. A tissue section is placed on the slide, followed by cell permeabilization and reverse transcription. Because the probes in each spot carry a spot-specific barcode together with a poly-dT sequence, the resulting barcoded cDNA provides transcriptome data together with spatial information. Although reports of spatial transcriptome analysis for spermatogenesis are still few, 193 the number of such studies is expected to increase soon, providing essential insights.
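The way a spot barcode links a transcript to a position on the slide can be shown with a small Python sketch. It assumes that each read has already been resolved to a spot barcode and a gene; the function name `spatial_counts`, the barcodes, the coordinates, and the reads are hypothetical. A real pipeline would also collapse UMIs and tolerate sequencing errors in barcodes, which are omitted here.

```python
from collections import Counter, defaultdict

# Hypothetical slide layout: spot barcode -> (row, column) on the slide
spot_coordinates = {
    "AAAC": (0, 0),
    "AAAG": (0, 1),
    "AACA": (1, 0),
}


def spatial_counts(reads, spot_coordinates):
    """Group gene counts by the slide position encoded in each read's barcode."""
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        position = spot_coordinates.get(barcode)
        if position is None:
            continue  # barcode not printed on this slide; discard the read
        counts[position][gene] += 1
    return counts


# Hypothetical reads: (spot barcode, gene)
reads = [
    ("AAAC", "Prm1"),
    ("AAAC", "Prm1"),
    ("AAAG", "Sycp3"),
    ("TTTT", "Prm1"),  # unknown barcode
    ("AACA", "Sox9"),
]

by_position = spatial_counts(reads, spot_coordinates)
for position in sorted(by_position):
    print(position, dict(by_position[position]))
```

The output is an expression profile per slide position, which can then be overlaid on the image of the stained tissue section.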
Single-cell transcriptome analyses targeting spermatogenesis have provided valuable data and novel insights. As single-cell technologies and their analysis algorithms are still developing, they are expected to contribute to understanding spermatogenesis.

CONFLICT OF INTEREST
Takahiro Suzuki declares that he has no conflict of interest.

HUMAN/ANIMAL RIGHTS STATEMENT AND INFORMATION
This article does not contain any studies with human and animal subjects performed by the author.